 primitive action




Creating Multi-Level Skill Hierarchies in Reinforcement Learning S

Neural Information Processing Systems

They had four primitive actions: north, south, east, and west. Multi-Floor Office is an extension of Office to multiple floors. Pick-up and put-down have the intended effect when appropriate; otherwise they do not change the state. Towers of Hanoi contains four discs of different sizes, placed on three poles. Options generated using alternative methods called primitive actions directly.
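The behaviour described in this snippet (four movement actions, plus pick-up and put-down that leave the state unchanged when they do not apply) can be illustrated with a minimal sketch. The 3x3 grid, the `State` tuple, and the `step` function are hypothetical names chosen for illustration; they are not taken from the paper's environments.

```python
from typing import NamedTuple, Optional, Tuple

SIZE = 3  # hypothetical 3x3 grid, not a layout from the paper

# Row/column offsets for the four movement actions.
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}


class State(NamedTuple):
    agent: Tuple[int, int]           # agent position as (row, col)
    item: Optional[Tuple[int, int]]  # item position, or None while carried


def step(state: State, action: str) -> State:
    """Apply one primitive action; inapplicable actions return the state unchanged."""
    row, col = state.agent
    if action in MOVES:
        d_row, d_col = MOVES[action]
        new_row, new_col = row + d_row, col + d_col
        # Moving off the grid has no effect.
        if 0 <= new_row < SIZE and 0 <= new_col < SIZE:
            return state._replace(agent=(new_row, new_col))
        return state
    if action == "pick-up":
        # Only takes effect when the agent stands on the item.
        if state.item == state.agent:
            return state._replace(item=None)
        return state
    if action == "put-down":
        # Only takes effect while the agent carries the item.
        if state.item is None:
            return state._replace(item=state.agent)
        return state
    raise ValueError(f"unknown action: {action}")
```

For example, starting from `State(agent=(0, 0), item=(0, 1))`, both `north` and `pick-up` return the same state, while `east` followed by `pick-up` yields a state whose `item` is `None`.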




0f3d014eead934bbdbacb62a01dc4831-Paper.pdf

Neural Information Processing Systems

In reinforcement learning, option models (Sutton, Precup & Singh, 1999; Precup, 2000) provide the framework for this kind of temporally abstract prediction and reasoning. Natural intelligent agents are also able to focus their attention on courses of action that are relevant or feasible in a given situation, sometimes termed affordable actions.





Appendix 1 Goal generation for executor training

Neural Information Processing Systems

The pseudo goal generation is introduced for training the executor without the coordinator. The scripted policy is allowed to access the grounded state, e.g. the absolute position. Note that it is not the optimal policy for the executor; it will fail when two targets are far apart. The notations used here are defined as follows. The objective is to maximize the number of covered targets. After formulation, we can solve the target coverage problem as an ILP problem with the CBC optimizer. Then, the primitive actions for all the sensors can be derived from the results of the ILP, as shown in Tab. 1.
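The coverage objective in this snippet (maximize the number of covered targets) can be illustrated on a toy instance. The sketch below uses plain exhaustive search rather than an ILP solved with CBC, so it is a stand-in for the idea and not the paper's formulation. The `coverage` data structure, the option names, and `best_assignment` are all hypothetical.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def best_assignment(
    coverage: List[Dict[str, Set[int]]],
) -> Tuple[Tuple[str, ...], int]:
    """Pick one option per sensor so that the number of covered targets is maximal.

    coverage[i][d] is the set of target ids that sensor i covers under option d.
    Exhaustive search: only practical for very small instances.
    """
    best_choice: Tuple[str, ...] = ()
    best_count = -1
    # Enumerate every combination of one option per sensor.
    for choice in product(*(sorted(options) for options in coverage)):
        covered: Set[int] = set()
        for sensor_options, option in zip(coverage, choice):
            covered |= sensor_options[option]
        if len(covered) > best_count:
            best_choice, best_count = choice, len(covered)
    return best_choice, best_count


# Hypothetical instance: two sensors, four targets.
toy = [
    {"left": {0, 1}, "right": {2}},
    {"left": {1}, "right": {2, 3}},
]
choice, count = best_assignment(toy)
print(choice, count)  # ('left', 'right') 4
```

An ILP solver scales to larger instances than this enumeration, whose cost grows exponentially with the number of sensors.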